Composing Graph Theory and Deep Neural Networks to Evaluate SEU Type Soft Error Effects
Rapidly shrinking technology nodes and voltage scaling increase the susceptibility of digital circuits to soft errors. Soft errors are radiation-induced effects that occur when radiation particles, such as alpha particles, neutrons or heavy ions, interact with sensitive regions of microelectronic devices and circuits. The particle hit can be a glancing blow or a penetrating strike. A well-understood and well-characterized way of analyzing soft error effects is the fault-injection campaign, but this is widely acknowledged to be a time- and resource-consuming simulation strategy. As an alternative to traditional fault-injection-based methodologies, and to explore the applicability of modern graph-based neural network algorithms to reliability modeling, this paper proposes a systematic framework that explores gate-level abstractions to extract and exploit relevant feature representations in a low-dimensional vector space. The framework allows extensive prediction analysis of SEU-type soft error effects in a given circuit. GraphSAGE, a scalable and inductive representation learning algorithm on graphs, is used to efficiently extract structural features of the gate-level netlist, providing a valuable database for a downstream machine learning or deep learning algorithm aimed at predicting fault propagation metrics. The Functional Failure Rate (FFR) of SEU-type faults is predicted within the gate-level circuit abstraction of the 10-Gigabit Ethernet MAC (IEEE 802.3) standard circuit.
Comment: 5 pages, 3 figures. Conference: 2020 9th Mediterranean Conference on Embedded Computing (MECO).
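The embedding step the abstract describes can be sketched as a GraphSAGE-style mean-neighbour aggregation over the netlist graph. The toy netlist, feature choices and helper names below are illustrative assumptions, not the paper's actual data; a real GraphSAGE layer would also apply a learned weight matrix and nonlinearity, which are omitted here.

```python
# Minimal sketch of GraphSAGE-style mean aggregation on a gate-level
# netlist graph (toy example; graph and features are assumptions).
from statistics import mean

# Toy netlist: gate -> list of fan-in gates
netlist = {
    "and1": ["in_a", "in_b"],
    "or1":  ["and1", "in_c"],
    "ff1":  ["or1"],
    "in_a": [], "in_b": [], "in_c": [],
}

# Initial per-gate features, e.g. [fan-in count, is_flip_flop]
features = {g: [float(len(ins)), 1.0 if g.startswith("ff") else 0.0]
            for g, ins in netlist.items()}

def sage_layer(feats, graph):
    """One aggregation step: concatenate a node's feature vector with
    the element-wise mean of its neighbours' feature vectors.
    (Learned weights and nonlinearity of real GraphSAGE omitted.)"""
    out = {}
    for node, neigh in graph.items():
        if neigh:
            agg = [mean(feats[n][i] for n in neigh)
                   for i in range(len(feats[node]))]
        else:
            agg = [0.0] * len(feats[node])
        out[node] = feats[node] + agg  # list concatenation
    return out

embeddings = sage_layer(features, netlist)
print(embeddings["ff1"])  # [1.0, 1.0, 2.0, 0.0]
```

Stacking several such layers lets each gate's embedding summarize its multi-hop structural neighbourhood, which is what makes the features useful for a downstream fault-propagation predictor.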
Machine Learning to Tackle the Challenges of Transient and Soft Errors in Complex Circuits
The Functional Failure Rate analysis of today's complex circuits is a difficult task and requires a significant investment in terms of human effort, processing resources and tool licenses. De-rating or vulnerability factors are therefore a major instrument of failure analysis efforts. Usually, computationally intensive fault-injection simulation campaigns are required to obtain fine-grained reliability metrics at the functional level. This paper therefore investigates the use of machine learning algorithms to assist this procedure and thus optimise and enhance fault injection efforts. Specifically, machine learning models are used to predict accurate per-instance Functional De-Rating data for the full list of circuit instances, an objective that is difficult to reach using classical methods. The described methodology uses a set of per-instance features, extracted through an analysis approach combining static elements (cell properties, circuit structure, synthesis attributes) and dynamic elements (signal activity). Reference data is obtained through first-principles fault simulation approaches. One part of this reference dataset is used to train the machine learning model and the remainder is used to validate and benchmark the accuracy of the trained tool. The presented methodology is applied to a practical example, and various machine learning models are evaluated and compared.
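The train/validate workflow the abstract describes can be sketched with a toy linear model: fit on a subset of fault-simulation reference data, then benchmark on held-out instances. The feature, the numbers and the single-variable model are invented for illustration and stand in for the paper's richer feature set and models.

```python
# Toy sketch: fit a linear model mapping one per-instance feature
# (here: signal activity) to a Functional De-Rating value, then
# validate on held-out instances. All numbers are invented.

# (feature, de-rating) pairs for circuit instances, e.g. flip-flops
data = [(0.1, 0.05), (0.2, 0.11), (0.4, 0.19), (0.5, 0.26),
        (0.7, 0.34), (0.8, 0.41), (0.9, 0.44), (1.0, 0.51)]
train, test = data[:6], data[6:]          # simple hold-out split

# Ordinary least squares for y = a*x + b (closed form, one feature)
n = len(train)
sx = sum(x for x, _ in train); sy = sum(y for _, y in train)
sxx = sum(x * x for x, _ in train); sxy = sum(x * y for x, y in train)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

# Benchmark on the held-out instances (mean absolute error)
mae = sum(abs(a * x + b - y) for x, y in test) / len(test)
print(f"slope={a:.3f} intercept={b:.3f} mae={mae:.3f}")
```

The same split-train-benchmark loop applies unchanged when the linear fit is replaced by the multi-feature machine learning models the paper compares.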
Machine Learning Clustering Techniques for Selective Mitigation of Critical Design Features
Selective mitigation, or selective hardening, is an effective technique to obtain a good trade-off between the improvement in the overall reliability of a circuit and the hardware overhead induced by the hardening techniques. Selective mitigation relies on preferentially protecting circuit instances according to their susceptibility and criticality. However, ranking circuit parts in terms of vulnerability usually requires computationally intensive fault-injection simulation campaigns. This paper presents a new methodology which uses machine learning clustering techniques to group flip-flops with similar expected contributions to the overall functional failure rate, based on the analysis of a compact set of features combining attributes from static and dynamic elements. Fault simulation campaigns can then be executed on a per-group basis, significantly reducing the time and cost of the evaluation. The effectiveness of grouping similarly sensitive flip-flops by machine learning clustering algorithms is evaluated on a practical example. Different clustering algorithms are applied and the results are compared to an ideal selective mitigation obtained by exhaustive fault-injection simulation.
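The grouping step can be sketched with plain k-means over per-flip-flop feature vectors; fault simulation then only needs a representative per group. The flip-flop names, feature values and fixed initial centroids below are invented for illustration, and real clustering would use a library implementation and many more features.

```python
# Sketch: group flip-flops with similar feature vectors by k-means,
# so fault simulation can run per group. Features are invented, e.g.
# (normalized signal activity, normalized fan-out).
flops = {
    "ff0": (0.1, 0.2), "ff1": (0.15, 0.25), "ff2": (0.9, 0.8),
    "ff3": (0.85, 0.75), "ff4": (0.5, 0.9), "ff5": (0.55, 0.85),
}

def kmeans(points, centroids, iters=10):
    """Plain k-means with fixed initial centroids (deterministic)."""
    groups = {}
    for _ in range(iters):
        # Assignment step: each flop joins its nearest centroid
        groups = {i: [] for i in range(len(centroids))}
        for name, p in points.items():
            i = min(range(len(centroids)),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            groups[i].append(name)
        # Update step: move each centroid to its group's mean
        centroids = [
            tuple(sum(points[n][d] for n in grp) / len(grp)
                  for d in range(2)) if grp else centroids[i]
            for i, grp in groups.items()
        ]
    return groups

clusters = kmeans(flops, centroids=[(0.0, 0.0), (1.0, 1.0), (0.5, 1.0)])
print(clusters)  # three groups of similarly sensitive flip-flops
```

A fault-injection campaign on one member of each cluster then approximates the per-instance results of an exhaustive campaign, at a fraction of the cost.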
Radiation-Hardening-By-Design (RHBD) and modeling of single event effects in digital circuits manufactured in Bulk 65 nm and FDSOI 28 nm
The extreme technology scaling of digital circuits increases their sensitivity to ionizing radiation, whether in space or terrestrial environments. Natural radiation can now induce single event effects in deca-nanometer circuits and impact their reliability. This thesis focuses on the modeling of single event mechanisms and the development of hardening-by-design solutions that mitigate the radiation threat to the circuit error rate. In the first part of this work, we developed a physical model for both the transport and collection of radiation-induced charges in a biased circuit, derived from pure physics-based equations without any fitting parameter. This model is called Random-Walk Drift-Diffusion (RWDD). This particle-level model and its numerical transient solving allow the coupling of the charge collection process with a circuit simulator, taking into account the time variations of the electric fields in the structure. The RWDD model is able to simulate the behavior of a circuit following a radiation impact, independently of the implemented function and the considered technology. In the second part of our work, hardening solutions that limit radiation impacts on circuit reliability were developed. At the elementary cell level, new radiation-hardened latch architectures were proposed, with a limited impact on performance. At the system level, a clock tree duplication methodology was proposed, leaning on specific latches. Finally, a triplication flow was designed for critical applications. All these solutions were implemented in 65 nm and UTBB-FDSOI 28 nm technologies, and radiation tests were performed to measure their hardening efficiency.
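The core of a random-walk drift-diffusion scheme is the per-carrier update: a deterministic drift term proportional to the local electric field plus a Gaussian diffusion term. The sketch below illustrates that one step under a uniform field; all physical values are order-of-magnitude assumptions and the field coupling with a circuit simulator, central to the thesis, is not modelled.

```python
# Sketch of one Random-Walk Drift-Diffusion (RWDD) step: each charge
# carrier drifts with the electric field and diffuses randomly.
# All physical values below are illustrative, not calibrated.
import math
import random

random.seed(0)

MU = 0.14      # electron mobility, m^2/(V*s)  (order of magnitude)
D  = 3.6e-3    # diffusion coefficient, m^2/s  (Einstein relation ~ MU*kT/q)
E  = 1e5       # electric field along x, V/m (uniform here; a real RWDD
               # run would query the field from a circuit/TCAD solver)
DT = 1e-12     # time step, s

def rwdd_step(x):
    """Advance one carrier: deterministic drift + Gaussian diffusion."""
    drift = MU * E * DT
    diffusion = math.sqrt(2 * D * DT) * random.gauss(0, 1)
    return x + drift + diffusion

# Transport an ensemble of carriers for 1000 steps
positions = [0.0] * 500
for _ in range(1000):
    positions = [rwdd_step(x) for x in positions]

mean_x = sum(positions) / len(positions)
# Ensemble mean displacement should approach MU*E*t = 0.14*1e5*1e-9 = 1.4e-5 m
print(f"mean displacement: {mean_x:.2e} m")
```

Tracking when each walker crosses a collecting junction, rather than just its position, is what turns this scheme into a transient collected-charge current that a circuit simulator can consume.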
Functional Failure Rate Due to Single-Event Transients in Clock Distribution Networks
With technology scaling, lower supply voltages and higher operating frequencies, clock distribution networks become more and more vulnerable to transient faults. These faults can cause circuit-wide effects and thus contribute significantly to the functional failure rate of the circuit. This paper proposes a methodology to analyse how the functional behaviour is affected by Single-Event Transients in the clock distribution network. The approach is based on logic-level simulation and thus only uses the register-transfer level description of a design. To this end, a fault model is proposed which implements the main effects of radiation-induced transients in the clock network. This fault model enables the computation of the functional failure rate caused by Single-Event Transients for each individual clock buffer, as well as for the complete network. Further, it allows the identification of the flip-flops most vulnerable to Single-Event Transients in the clock network.
The proposed methodology is applied to a practical example and a fault injection campaign is performed. In order to evaluate the impact of Single-Event Transients in clock distribution networks, the obtained functional failure rate is compared to the error rate caused by Single-Event Upsets in the sequential logic.
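One common effect of an SET on a clock buffer is a spurious extra clock edge, which re-clocks every flip-flop in the buffer's fanout without new input data. The toy shift register below illustrates how such a fault model turns into a functional failure at logic level; the register, the input stream and the single-extra-edge model are illustrative assumptions, not the paper's actual fault model.

```python
# Sketch of a clock-SET fault model: a transient on a clock buffer
# produces one spurious extra clock edge, so the flip-flops in its
# fanout shift twice in one cycle. Toy 3-bit shift register.
def shift_register(stream, spurious_edge_at=None):
    """Shift bits through 3 flip-flops; a spurious edge re-clocks
    the register with its input unchanged (models an SET on the clock)."""
    ffs = [0, 0, 0]
    for cycle, bit in enumerate(stream):
        ffs = [bit] + ffs[:-1]              # normal clock edge
        if cycle == spurious_edge_at:        # radiation-induced edge
            ffs = [ffs[0]] + ffs[:-1]        # data shifted a second time
    return ffs

golden = shift_register([1, 0, 1, 1])
faulty = shift_register([1, 0, 1, 1], spurious_edge_at=2)
print(golden, faulty)   # [1, 1, 0] [1, 1, 1]
```

Comparing the faulty run against the golden run per injection site is exactly the comparison that yields a per-clock-buffer functional failure rate: injections whose final state matches the golden one are masked, the rest count as functional failures.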
On the Estimation of Complex Circuits Functional Failure Rate by Machine Learning Techniques
De-Rating or Vulnerability Factors are a major feature of the failure analysis efforts mandated by today's Functional Safety requirements. Determining the Functional De-Rating of sequential logic cells typically requires computationally intensive fault-injection simulation campaigns. In this paper a new approach is proposed which uses Machine Learning to estimate the Functional De-Rating of individual flip-flops and thus optimise and enhance fault injection efforts. First, a set of per-instance features is described and extracted through an analysis approach combining static elements (cell properties, circuit structure, synthesis attributes) and dynamic elements (signal activity). Second, reference data is obtained through first-principles fault simulation approaches. Finally, one part of the reference dataset is used to train the Machine Learning algorithm and the remainder is used to validate and benchmark the accuracy of the trained tool. The intended goal is to obtain a trained model able to provide accurate per-instance Functional De-Rating data for the full list of circuit instances, an objective that is difficult to reach using classical methods. The presented methodology is accompanied by a practical example to determine the performance of various Machine Learning models for different training set sizes.
Comment: arXiv admin note: text overlap with arXiv:2002.0888
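The training-size study the abstract mentions can be sketched as a benchmarking loop: train on a growing fraction of the reference fault-simulation data and measure accuracy on the rest. The data and the nearest-neighbour predictor below are toy assumptions standing in for the paper's models.

```python
# Sketch of a training-size benchmark: accuracy of a toy predictor
# versus the amount of reference fault-simulation data it was given.
def nearest_neighbour_predict(train, x):
    """Predict de-rating of an instance from its closest trained instance."""
    return min(train, key=lambda t: abs(t[0] - x))[1]

# (feature, reference de-rating) per flip-flop; synthetic linear data
data = [(i / 10, 0.5 * i / 10 + 0.01) for i in range(10)]

maes = {}
for n_train in (3, 5, 8):
    train, test = data[:n_train], data[n_train:]
    mae = sum(abs(nearest_neighbour_predict(train, x) - y)
              for x, y in test) / len(test)
    maes[n_train] = mae
    print(f"train size {n_train}: MAE {mae:.3f}")
```

On this synthetic data the error shrinks as the training set grows, which is the trade-off such a study quantifies: how much fault simulation can be replaced by prediction before accuracy degrades.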